[Claim(s)] 

[Claim 1] A task request and task undertaking action determination method for an environment
with two or more decision-making subjects, in which two or more decision-making subjects exist,
each decision-making subject has two or more action alternatives, the gain of each subject is
determined by the combination of actions taken by the decision-making subjects, and each
subject determines an action so that its gain becomes large, the method being characterized in
that: when the gain for a combination of actions is described, the relation between
combinations of actions and gains is not described exhaustively for all states of the
decision-making subjects; instead, the way the gain changes as a result of an action is
expressed using a change rule, and the relation between combinations of actions and gains in a
future state is calculated and derived from the change rule as needed, thereby simplifying the
problem expression; and, when calculating which action should be taken, the action is not
determined by backward reasoning from future actions, nor only from the short-term profit and
loss at present; instead, the change in future gain caused by the action at present is
evaluated and the evaluation is included in the decision criterion for action selection,
whereby the long-term gain of each decision-making subject is increased.
[Claim 2] A task request and task undertaking action determination method for an environment
with two or more decision-making subjects, in a system in which two or more computers are
connected through a network, characterized by having: a process which calculates the gain for
actions after the present from the gain for the action of the decision-making subject at
present and the change rule of the gain, both stored in the storage of a computer; and a
process which determines the action of the decision-making subject so that the utility,
calculated from the gain of the action of the decision-making subject at present and the
future gain change caused by that action, becomes the maximum.



[Detailed Description of the invention] 
[0001] 

[Field of the Invention] This invention relates to the decision problem of task request and
task undertaking on a network system of interconnected computers. More specifically, it
relates to a task request and task undertaking action determination method for an environment
with two or more decision-making subjects, which determines whether a computer with a task to
be processed issues a data transfer request or a database reference request to another
computer or processes the task by itself, and which determines, when there is a task request
from another computer, whether or not to undertake it.
[0002] 

[Description of the Prior Art] Consider an environment where two or more computers are
connected to a network, each computer has its own task, and information is exchanged between
computers for task processing. Suppose that these computers are not given a single task as a
whole, and that there is no central computer directing them to cooperate with one another.
Here, suppose for example that data in the database managed by Computer B is useful for the
task processing of Computer A, but that the task can be processed even without this data. In
this case, Computer A can choose either to request data reference from Computer B and then
process the task, or to process the task by itself without making a request. Computer B, in
turn, chooses how to respond when there is a request.

[0003] In general, the action that benefits oneself and the action that benefits a partner do
not always coincide. That is, it is impossible to force a partner computer into an action
advantageous to oneself. Therefore, one must choose the action which maximizes one's own
profit under the assumption that the partner computer acts rationally. Game theory is a method
of dealing with such an action selection problem in which a partner exists.
[0004] Game theory assumes that (1) two or more decision-making subjects exist, (2) each
decision-making subject has two or more action alternatives, (3) each decision-making subject
chooses an action and the gain of each subject is determined by the combination of the
actions, and (4) all decision-making subjects, including the others, act so as to maximize
their own utility (gain). On these premises it gives a method of determining which action
should be chosen. It also gives a method of analyzing how a change in the setting of the
utility changes the properties of the whole system (see [Mitsuo Suzuki, "New Game Theory,"
Keiso-Shobo, 1994] and [Toshio Nishida, "Theory of Games," Union of Japanese Scientists and
Engineers, 1973]).

[0005] The process of selecting actions and acquiring the resulting gain is called a game, and
a decision-making subject is called a participant. Each stage which constitutes a game, i.e.,
each action selection opportunity, is called a move. Here we deal with two-participant,
noncooperative games of complete information. Noncooperative means that no legally binding
prior agreement exists; whether or not the participants appear, as a result, to have acted in
concert is irrelevant. Complete information means that every participant knows all the actions
chosen earlier in the process.

[0006] One method of representing a game is the game tree. An example of a game represented as
a game tree is shown in drawing 9. PA and PB denote the moves of participants A and B in the
game, respectively. A branch represents a selectable action. The figure represents a process in
which A first chooses action a1 or a2, and B, having observed that choice, then chooses action
b1 or b2. The gain of each participant is determined by the combination of the actions of A
and B and is shown at the endpoint. This is called a gain vector.

[0007] Some games do not end after both sides choose an action once each, but continue over
many moves, repeating the game shown in drawing 9 as a component. Such a game is called a
multistage game. In this description, the unit running from the present until both sides have
each chosen one action and the gain is decided is called the game at present.

[0008] Given a game tree in which the total number of moves is N, the conventional method of
selecting an action at each move is shown below. Here, utility is defined as utility = gain,
and N is assumed to be even.
[0009] (The action selection method on a game tree)

(1) Set i <- N.

(2) At each branch point of move i, look at the utility vector and take as the selection
behavior the action which maximizes the utility of the participant making this move. Here, the
utility vector is the same as the gain vector.

(3) At each branch point of move i-1, choose the action which maximizes the utility of the
participant making this move, based on the utility vectors of the selection behavior of (2),
and take it as the selection behavior at that branch point.

(4) If i = 2, set i <- (i-1) and go to (7).

(5) If i is not 2, assign the utility vector value of the selection behavior at the following
move i-1 to the utility vector of each action at the branch points of move i-2.

(6) Set i <- (i-2) and go to (2).

(7) Take the selection behavior at the branch point selected in move i as the selection
behavior in this game. In move 1 there is only one branch point, which is taken as the
selected branch point.

(8) If i = N, end.

(9) Take the branch point which follows the selection behavior of (7) as the selected branch
point in the next move, and set i <- (i+1).

(10) Go to (7).

[0010] In a perfect information game, each participant knows which branch point it is at. The
validity of the above method in a noncooperative game therefore follows from the fact that, at
each branch point and thereafter, every participant always chooses the action which maximizes
its own utility.

[0011] Referring to the example of drawing 9, at move 2 (PB) action b2 is chosen at both branch
points, since for B the utility of action b2 is larger than that of b1. Next, at move 1 (PA),
the comparison is between a1 (utility vector (0, 8)) and a2 (utility vector (2, 2)). Since the
utility of a2 is larger for A, a2 is chosen here. In this game, therefore, the action sequence
a2 -> b2 appears in the end.
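
The conventional backward calculation can be sketched in Python for a two-move game like the
one in drawing 9. This is only a minimal illustration, not the implementation of the
apparatus. The names GAINS and backward_induction are hypothetical. A's gains and B's gains
for b2 are consistent with the values quoted in this description; B's gains for b1 (5 and 0)
are assumed, since the figure is not reproduced here.

```python
# Gain vectors (gain of A, gain of B) at the endpoints of the two-move tree.
# B's gains for b1 (5 and 0) are ASSUMED for illustration only.
GAINS = {
    ("a1", "b1"): (5, 5),
    ("a1", "b2"): (0, 8),
    ("a2", "b1"): (8, 0),
    ("a2", "b2"): (2, 2),
}


def backward_induction(gains):
    """Solve a two-move game (A moves first, then B) from the last move."""
    a_actions = sorted({a for a, _ in gains})
    # Move 2 (PB): at each branch point B maximizes its own utility.
    b_reply = {}
    for a in a_actions:
        b_actions = [b for (a_, b) in gains if a_ == a]
        b_reply[a] = max(b_actions, key=lambda b: gains[(a, b)][1])
    # Move 1 (PA): A compares the utility vectors carried back from move 2.
    best_a = max(a_actions, key=lambda a: gains[(a, b_reply[a])][0])
    best_b = b_reply[best_a]
    return best_a, best_b, gains[(best_a, best_b)]
```

With these values, backward_induction(GAINS) returns ("a2", "b2", (2, 2)), which corresponds
to the action sequence a2 -> b2.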
[0012] 

[Problem to be solved by the invention] Suppose that a person inputs the gain vectors and a
computer calculates the action selection. A block diagram of an implementation using the
conventional technology is shown in drawing 10. When a game is to be expressed as a game tree,
the following problems arise.
[0013] (a) In general, the action taken at present changes the gain vectors of the games
performed afterwards. It must therefore be possible to handle this change. Preparing only one
small game and simply repeating it is inadequate.
[0014] (b) Since the game tree becomes huge as the number of moves N becomes large, it is
difficult to write it down in a form expanded in advance.

[0015] (c) If pairs consisting of a participant's state and the gain vector of the game
performed in that state are described, the game tree can be expanded as needed. However, as the
number of states increases, the amount of description of the pairs of states and gain vectors
becomes huge and management becomes difficult.
[0016] Moreover, the following problems arise in calculation.

[0017] (d) Calculation in the conventional action selection method described above proceeds
backward, starting from the last move N. However, since no participant has binding power to
keep another participant in the game, it cannot be known in advance how many moves will be
performed. Therefore, when N is not given, it cannot be determined where the calculation
should begin.

[0018] (e) Even when the total number of moves N is known, if N is large the game tree becomes
huge, the computational complexity increases, and determining an action takes time.

[0019] (f) If the calculation from the last move N is abandoned and the action is decided only
on the game at present, future profit and loss cannot be included in the calculation. It
therefore becomes impossible to select an action which loses in the short term but obtains
gain in the long run.

[0020] This invention was made in view of the above. Its purpose is to provide a task request
and task undertaking action determination method for an environment with two or more
decision-making subjects which, by introducing a description based on change rules of gain,
makes decisions in a short time while still enabling selection of actions that gain in the long
run, and which can thereby increase the gain obtained in the long run.

[0021] 

[Means for solving problem] In order to attain the above purpose, the invention according to
claim 1 is a task request and task undertaking action determination method for an environment
with two or more decision-making subjects, in which two or more decision-making subjects exist,
each decision-making subject has two or more action alternatives, the gain of each subject is
determined by the combination of actions taken by the decision-making subjects, and each
subject determines an action so that its gain becomes large. Its gist is as follows. When the
gain for a combination of actions is described, the relation between combinations of actions
and gains is not described exhaustively for all states of the decision-making subjects;
instead, the way the gain changes as a result of an action is expressed using a change rule,
and the relation between combinations of actions and gains in a future state is calculated and
derived from the change rule as needed, thereby simplifying the problem expression. Moreover,
when calculating which action should be taken, the action is not determined by backward
reasoning from future actions, nor only from the short-term profit and loss at present;
instead, the change in future gain caused by the action at present is evaluated and the
evaluation is included in the decision criterion for action selection, so that the long-term
gain of each decision-making subject is increased.

[0022] In the invention according to claim 1, the way the gain changes as a result of an action
is expressed using a change rule, and the problem expression is simplified by calculating and
deriving, as needed, the relation between combinations of actions and gains in a future state
from the change rule. When calculating which action should be taken, the long-term gain of each
decision-making subject is increased by evaluating the change in future gain caused by the
action at present and including the evaluation in the decision criterion for action selection.

[0023] Moreover, the invention according to claim 2 is a task request and task undertaking
action determination method for an environment with two or more decision-making subjects, in a
system in which two or more computers are connected through a network. Its gist is to have a
process which calculates the gain for actions after the present from the gain for the action
of the decision-making subject at present and the change rule of the gain, both stored in the
storage of a computer, and a process which determines the action of the decision-making
subject so that the utility, calculated from the gain of the action of the decision-making
subject at present and the future gain change caused by that action, becomes the maximum.

[0024] In the invention according to claim 2, the gain for actions after the present is
calculated from the gain for the action of the decision-making subject at present and the
change rule of the gain. The action of the decision-making subject is determined so that the
utility, calculated from the gain of the action at present and the future gain change caused by
that action, becomes the maximum.
[0025] 

[Mode for carrying out the invention] An embodiment of this invention is explained below with
reference to the drawings.

[0026] Drawing 1 is a block diagram showing the configuration of an apparatus which carries out
the task request and task undertaking action determination method for an environment with two
or more decision-making subjects according to one embodiment of this invention. The apparatus
shown in the figure has a storage 1, an I/O device 3, an arithmetic unit 5 which performs
utility calculation, and a communication device 7 which communicates with a partner computer.
It differs from the conventional apparatus shown in drawing 10 in that change rules of the
gain vector are newly added to the storage 1, alongside the database of the game tree. By
adding a way of describing the change rules of the gain vector separately, the game-tree
expression of a game is made easier. A change rule describes how the gain vector changes in
subsequent games when the game at present is performed and a certain action is taken.

[0027] Drawing 2 shows an example of the description of the change rules for drawing 9.
Drawing 2 (a) is the change rule for A, and drawing 2 (b) is the change rule for B. In the
tables of this figure, x denotes the value of the gain in the preceding game. When actions a1
and b1 are taken, the cell at the intersection of row a1 and column b1 of the table is
consulted. For A, that cell contains the expressions x+1, x+2, x-1, and x+1, which correspond
to the gains of (a1, b1), (a1, b2), (a2, b1), and (a2, b2) in the next game, respectively.
Actual calculation gives the following.

[0028]
(a1, b1) : x+1 = 5+1 = 6
(a1, b2) : x+2 = 0+2 = 2
(a2, b1) : x-1 = 8-1 = 7
(a2, b2) : x+1 = 2+1 = 3
[0029] Drawing 3 is obtained by calculating similarly for the other action pairs and for B. The
gain vectors of subsequent games can also be calculated as needed.
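
The calculation above can be reproduced with a short Python sketch. The data layout is an
assumption made for illustration (the tables of drawing 2 are not reproduced here), and the
names GAIN_A_NOW, RULE_A_AFTER_A1_B1 and next_gains are hypothetical. Only A's side and only
the cell for the action pair (a1, b1) are shown.

```python
# A's gain for each action pair in the game at present (the x values above).
GAIN_A_NOW = {
    ("a1", "b1"): 5,
    ("a1", "b2"): 0,
    ("a2", "b1"): 8,
    ("a2", "b2"): 2,
}

# Cell of A's change-rule table for the action pair (a1, b1):
# one expression in x per action pair of the next game.
RULE_A_AFTER_A1_B1 = {
    ("a1", "b1"): lambda x: x + 1,
    ("a1", "b2"): lambda x: x + 2,
    ("a2", "b1"): lambda x: x - 1,
    ("a2", "b2"): lambda x: x + 1,
}


def next_gains(current, rule_cell):
    """Apply one change-rule cell to the present gains to get the next game."""
    return {pair: rule_cell[pair](x) for pair, x in current.items()}
```

Calling next_gains(GAIN_A_NOW, RULE_A_AFTER_A1_B1) gives 6, 2, 7 and 3 for the four action
pairs. Because the next game is derived only when it is needed, only the present gains and the
rule cells have to be stored.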

[0030] Next, an action selection method is proposed which adds an evaluation of the relation
between the action at present and subsequent games. The method evaluates whether or not the
action at present leads to a game that is more advantageous for oneself afterwards, i.e., a
game in which a bigger gain may be obtained, combines this with the evaluation of the profit
and loss in the game at present, and performs action selection. The utility is defined by the
following formula.
[0031] 

[Mathematical formula 1]

Utility = (gain of the game at present) + w x (gain change in subsequent games)   (1)

where w is the coefficient of expectation.

The participant in a game chooses the action which maximizes this utility. The coefficient of
expectation differs for each participant and is adjusted according to the task each has.

[0032] The computational procedure of the proposed method is shown below. Note that the game at
present refers to the game that lasts until both sides have each chosen one action and the
gain is decided.

[0033] (The action selection method incorporating an evaluation of the change in subsequent
games caused by the action at present)

(1) Let the move at present be i.

(2) Calculate the utility of formula (1) above from the gains of the game at present.

(3) At each branch point of move i+1, take as the selection behavior at that branch point the
action which maximizes the utility of the participant making that move.

(4) At each branch point of move i, choose the action which maximizes the utility of the
participant making this move, based on the utility vectors of the selection behavior of (3),
and take it as the selection behavior in this game.

(5) Take the selection behavior at the branch point which follows the selection behavior at
move i as the selection behavior in move i+1.

(6) End.
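
The steps above can be sketched in Python as follows. This is a minimal illustration under
stated assumptions, not the apparatus itself: the function names are hypothetical, the gain
change in subsequent games is taken as a given input for each action pair, and the sample data
(including all of the gain changes) are invented for the example.

```python
def utility(gain, change, w):
    """Formula (1): present gain plus w times the future gain change."""
    return gain + w * change


def select_actions(gains, changes, w_a, w_b):
    """One-step lookahead for a game where A moves first and B replies.

    gains[(a, b)]   -> (gain of A, gain of B) in the game at present
    changes[(a, b)] -> (gain change for A, gain change for B) in later games
    """
    a_actions = sorted({a for a, _ in gains})
    # Step (3): B's best reply at each branch point of move i+1.
    b_reply = {}
    for a in a_actions:
        b_actions = [b for (a_, b) in gains if a_ == a]
        b_reply[a] = max(
            b_actions,
            key=lambda b: utility(gains[(a, b)][1], changes[(a, b)][1], w_b),
        )
    # Step (4): A's choice at move i, given B's replies.
    best_a = max(
        a_actions,
        key=lambda a: utility(
            gains[(a, b_reply[a])][0], changes[(a, b_reply[a])][0], w_a
        ),
    )
    # Step (5): B's selection follows the branch A selected.
    return best_a, b_reply[best_a]


# HYPOTHETICAL data: only the pair (a1, b1) improves later games.
GAINS = {("a1", "b1"): (5, 5), ("a1", "b2"): (0, 8),
         ("a2", "b1"): (8, 0), ("a2", "b2"): (2, 2)}
CHANGES = {("a1", "b1"): (1, 1), ("a1", "b2"): (0, 0),
           ("a2", "b1"): (0, 0), ("a2", "b2"): (0, 0)}
```

With this invented data, select_actions(GAINS, CHANGES, 10, 10) returns ("a1", "b1"), whereas
setting both coefficients of expectation to 0 reduces the utility to the present gain and
returns ("a2", "b2"). Only the game at present is searched, so the cost does not depend on the
total number of moves N.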

[0034] By introducing the expression based on the above change rules, the amount of description
can be expected to be smaller than when only pairs of states and gain vectors are described. If
the game tree is expanded only as needed, the required storage capacity can also be reduced.
[0035] Moreover, the action selection method incorporating an evaluation of the change in
subsequent games caused by the action makes the following possible.

[0036] (a) Since the proposed technique does not perform exact calculation as the conventional
technology does but estimates the increment of future gain, it does not necessarily always make
the right judgment. However, it becomes unnecessary to calculate the whole game tree
exhaustively as in the conventional technology, and computation time is reduced. This is
especially effective when there are time constraints, as in the control of a physical system.

[0037] (b) Even when the total number of moves N is not known in advance, the judgment is no
longer based only on the profit and loss at present; if there is an action which may obtain a
big profit in the future despite a loss at present, it becomes possible to choose it.

[0038] (c) By adjusting the coefficient of expectation, action selection suited to the
participant's load at present becomes possible. For example, if a participant has capacity to
spare at present, it can enlarge the coefficient of expectation and respond cooperatively even
to a high-load request from a partner. If the partner also uses the same action selection
method, then when one's own load is high and one wants to distribute a task to the partner, the
possibility that the partner will consent increases. Conversely, if a participant is
hard-pressed, it can make the coefficient of expectation small, refuse requests from partners,
and concentrate on its own task processing.

[0039] Next, as a concrete example, an information distribution problem between Computer A and
Computer B is considered. The problem setting is as follows.

(1) Computer A and Computer B are connected to the same network. 

(2) For its task processing, Computer A issues to the partner Computer B a data reference
request for the database which B manages.

(3) When Computer A issues a reference request to the partner, it incurs a cost (negative
gain).

(4) When Computer B accepts the database reference request of the partner computer, it incurs
a cost (negative gain). If it refuses the request, no cost is incurred.

(5) If data can be obtained from the partner, it is useful for task processing, so Computer A
obtains a positive gain.

(6) During task processing, Computer A sends unnecessary packets onto the network and imposes
a load on the partner computer. The degree of the load depends on the knowledge level of
Computer A. When its knowledge level about the distribution of information on the network is
high, the load imposed on its surroundings can be kept small.

(7) If Computer B accepts a reference request, the accompanying knowledge about the
distribution of information is accumulated in Computer A, and the knowledge level of A rises.

(8) When there is no exchange of information between the computers, Computer A cannot acquire
knowledge about the distribution of information, and its knowledge level falls.

[0040] The relation between actions and gains in this problem is shown in drawing 4. The cost
of a query and a reply is shown in drawing 4 (a). The gain from obtaining data is shown in
drawing 4 (b). The initial value of the cost imposed on B by task processing is shown in
drawing 4 (c). The gain from performing this game once is the sum of (a), (b), and (c).
Reference numeral 51 in drawing 5 shows the game tree in the initial state. The cost of
drawing 4 (c) changes with the knowledge level of Computer A. The change rule of this cost is
shown in drawing 6. Reference numeral 53 in drawing 5 shows the game tree in the state
following the initial state, obtained using this rule.
[0041] The problem here is to determine how Computers A and B should act under these premises.
The calculation procedure is shown below.

[0042] (1) Define the (gain change in subsequent games) of formula (1) as follows.
[Mathematical formula 2] (Gain change in subsequent games) = (gain vector value of the game at
the next time) - (gain vector value of the game at present)

(2) Estimate the coefficient of expectation w. This value is found by trial and error in the
course of repeating the game.

(3) Calculate the gain vector of the game at the next time using the change rule.

(4) Determine the action using the [action selection method incorporating an evaluation of the
change in subsequent games caused by the action at present].

(5) If at least one side stops the game, end there.

(6) Return to (3).
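
The repeated procedure can be sketched in Python as below. This is a hedged illustration
rather than the method as implemented. The change rule toy_change_rule and the starting gains
are hypothetical stand-ins for drawings 4 to 6, formula 2 is read here as comparing the gain
at the same action pair in the next game and in the game at present (an assumption), and the
stopping condition of step (5) is replaced by a fixed number of repetitions.

```python
def toy_change_rule(gains, pair):
    """HYPOTHETICAL change rule: if B accepts A's request (a1, b1), every
    gain in the next game rises by 1; otherwise the game is unchanged."""
    step = 1 if pair == ("a1", "b1") else 0
    return {p: (ga + step, gb + step) for p, (ga, gb) in gains.items()}


def gain_change(gains, rule, pair, k):
    """Formula 2 for participant k (0 = A, 1 = B), under the ASSUMED reading
    that both gain vector values are taken at the same action pair."""
    return rule(gains, pair)[pair][k] - gains[pair][k]


def choose(gains, rule, w):
    """Step (4): pick actions by formula (1), A moving first and B replying."""
    def u(pair, k):
        return gains[pair][k] + w * gain_change(gains, rule, pair, k)

    a_actions = sorted({a for a, _ in gains})
    b_reply = {}
    for a in a_actions:
        b_actions = [b for (a_, b) in gains if a_ == a]
        b_reply[a] = max(b_actions, key=lambda b: u((a, b), 1))
    best_a = max(a_actions, key=lambda a: u((a, b_reply[a]), 0))
    return best_a, b_reply[best_a]


def play(gains, rule, w, repetitions):
    """Steps (3) to (6), stopping after a fixed number of repetitions."""
    history = []
    for _ in range(repetitions):
        pair = choose(gains, rule, w)        # steps (3) and (4)
        history.append((pair, gains[pair]))  # gain acquired in this game
        gains = rule(gains, pair)            # step (6): move to the next game
    return history


# HYPOTHETICAL starting gains (gain of A, gain of B).
START = {("a1", "b1"): (5, 5), ("a1", "b2"): (0, 8),
         ("a2", "b1"): (8, 0), ("a2", "b2"): (2, 2)}
```

With this toy data and w = 10, play(START, toy_change_rule, 10, 3) selects (a1, b1) in every
game with gains (5, 5), (6, 6) and (7, 7); with w = 0 it selects (a2, b2) every time and the
gains stay at (2, 2).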

[0043] Here, for procedure (2), it is assumed that the coefficient of expectation w = 10 is
given and that both Computers A and B know it. An example of procedure (3) is shown by 53 of
drawing 5. In procedure (4), when Computer A takes action a2, Computer B takes no action; here
the calculation treats B as performing a dummy action of doing nothing. An example of the
utility calculated for the initial state is shown in drawing 7. In the initial state, the
proposed method chooses actions a1 and b1.
[0044] Next, the proposed method is evaluated. The total number of moves N cannot be known
before the game starts. Therefore, the proposed method is compared with a method that selects
actions only from the gain vectors of the game tree at present. The evaluation criterion is
the average of acquired gain.
[0045] 

[Mathematical formula 3] (Average of acquired gain) = (total gain acquired so far) / (number
of repetitions of the game)

The number of moves in one game varies between 1 and 2 depending on the actions taken.
Therefore, the term "number of repetitions of the game" is used here instead of the number of
moves.
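
Formula 3 amounts to a running average, which can be sketched in Python as follows. The
function names are hypothetical, and the helper first_crossover is an added convenience for
comparing two methods; it is not part of the description.

```python
def running_average(gains_per_game):
    """Average of acquired gain after each repetition (formula 3)."""
    averages, total = [], 0
    for n, gain in enumerate(gains_per_game, start=1):
        total += gain
        averages.append(total / n)
    return averages


def first_crossover(avg_x, avg_y):
    """1-based repetition at which avg_x first exceeds avg_y, else None."""
    for n, (x, y) in enumerate(zip(avg_x, avg_y), start=1):
        if x > y:
            return n
    return None
```

For example, running_average([2, 4, 6]) gives [2.0, 3.0, 4.0]. Feeding the per-game gains of
two methods to running_average and the results to first_crossover reports the repetition at
which one average overtakes the other.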

[0046] The relation between the number of repetitions of the game and the acquired gain is
shown in drawing 8. With the method of this invention, actions a1 and b1 were chosen every
time; with the conventional method, action a2 was chosen every time. The figure shows the
following. With the conventional method, Computer A cannot obtain any gain. With the method of
this invention, in contrast, Computer B takes the cooperative action b1, so A can acquire
gain. As for B, by taking the cooperative action b1, the load imposed on it by A's task
processing becomes smaller, and as the number of repetitions increases its gain becomes larger
than with the conventional method. In drawing 8 (b), the averages of gain cross over at 11
repetitions. This shows that if the game tree is first expanded to 11 or more repetitions, the
method of determining actions exactly by backward calculation from the last move gives the
same result as the method of this invention, even when the definition utility = gain is used
instead of formula (1). However, a game tree with 11 or more repetitions is quite large, and
considerable calculation time is needed before the first action can be taken. In fields such as
the control of a physical system this is a major drawback, which demonstrates the advantage of
the method of this invention.
[0047] 

[Effect of the Invention] As explained above, according to this invention, a description based
on change rules of gain is introduced into the decision problem of task request and task
undertaking actions by two or more decision-making subjects. This reduces the amount of
description of the correspondence between action pairs and gain vectors. In addition, the
change in gain in the games performed afterwards is evaluated and included in the evaluation
criterion for action selection. As a result, decisions are made in a short time, actions which
lose in the short term but gain in the long run can be selected, and the gain obtained in the
long run can be increased.



[Brief Description of the Drawings] 

[Drawing 1] A block diagram showing the configuration of an apparatus which carries out the
task request and task undertaking action determination method for an environment with two or
more decision-making subjects according to one embodiment of this invention.
[Drawing 2] A figure showing the form of the change rules of a gain vector.
[Drawing 3] A figure showing an example calculation of gain vectors using the change rules of
the gain vector.

[Drawing 4] A figure showing the components of the gain vector in the example.
[Drawing 5] A figure showing the game tree expression in the initial state of the example and
the game tree expression in the following state.

[Drawing 6] A figure showing an example calculation of the gain vector in the example.
[Drawing 7] A figure showing an example calculation of the utility vector in the example.

[Drawing 8] A figure comparing the conventional method and the method of this invention with
respect to the change in acquired gain against the number of repetitions of the game.
[Drawing 9] A figure showing an example of a game represented as a game tree.
[Drawing 10] A figure showing the configuration of an apparatus according to the conventional
technology.

[Explanations of letters or numerals] 

1 Storage 

3 I/O Device 

5 Arithmetic Unit 

7 Communication Device 